Pull request overview
This PR splits OpenAI REST API handling into a shared base (OpenAIApiHandler) plus endpoint-specific handlers, moving /v3/responses logic into a dedicated OpenAIResponsesHandler and updating serving code, build targets, tests, and documentation accordingly.
Changes:
- Introduces `OpenAIApiHandler` base class and refactors the chat/completions handler to inherit from it.
- Adds a dedicated `OpenAIResponsesHandler` with Responses-specific parsing plus (streaming/unary) serialization, and wires it into servables/calculator.
- Updates Bazel targets and unit tests, and adds REST API documentation for `/v3/responses`.
Reviewed changes
Copilot reviewed 14 out of 15 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| `src/test/http_openai_handler_test.cpp` | Updates tests to construct the correct handler type per endpoint and validates the new streaming event sequence APIs. |
| `src/llm/visual_language_model/legacy/servable.cpp` | Instantiates `OpenAIResponsesHandler` for `RESPONSES` and adjusts streaming chunk emission behavior for empty chunks. |
| `src/llm/language_model/legacy/servable.cpp` | Same as above for the legacy language model servable. |
| `src/llm/servable.hpp` | Switches the execution context to store a polymorphic `std::shared_ptr<OpenAIApiHandler>`. |
| `src/llm/servable.cpp` | Creates the Responses vs. chat handler based on endpoint; adjusts streaming chunk emission comments/behavior. |
| `src/llm/http_llm_calculator.cc` | Emits the `response.created` lifecycle event for streaming via a handler hook; uses the generic `serializeFailedEvent`. |
| `src/llm/apis/openai_api_handler.hpp` / `src/llm/apis/openai_api_handler.cpp` | New shared base class with common parsing, tool parsing, usage, and shared utilities. |
| `src/llm/apis/openai_responses.hpp` / `src/llm/apis/openai_responses.cpp` | New Responses endpoint handler (request parsing plus unary/streaming serialization). |
| `src/llm/apis/openai_completions.hpp` / `src/llm/apis/openai_completions.cpp` | Refactors the chat/completions handler to inherit shared logic and removes embedded Responses logic. |
| `src/llm/BUILD` | Adds Bazel targets for the new handler libraries and wires dependencies. |
| `src/BUILD` | Links the new Responses handler library into the main build. |
| `docs/model_server_rest_api_responses.md` | Adds user documentation for `/v3/responses`, including streaming event semantics. |
```cpp
const auto firstMissmatch = std::mismatch(imageUrl.begin(), imageUrl.end(), allowedLocalMediaPath.value().begin(), allowedLocalMediaPath.value().end());
if (firstMissmatch.second != allowedLocalMediaPath.value().end()) {
    return absl::InvalidArgumentError("Given filepath is not subpath of allowed_local_media_path");
}
```
Local file path allowlist check uses std::mismatch prefix matching, which can be bypassed by paths that share the same prefix (e.g., allowed "/tmp/allowed" would also allow "/tmp/allowed2/..."), and doesn’t canonicalize symlinks/relative segments. Consider using std::filesystem::weakly_canonical/lexically_normal and then verifying the canonical path is within allowedLocalMediaPath (including a path-separator boundary check).
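A minimal sketch of the suggested fix (`isUnderAllowedRoot` is a hypothetical helper, not code from this PR): canonicalize both paths, then use `lexically_relative`, whose result starts with `..` exactly when the target escapes the root, which also covers the separator-boundary case (`/tmp/allowed2` is no longer considered inside `/tmp/allowed`).

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: true only if `requested` resolves to a path at or
// under `allowedRoot` after collapsing symlinks and "." / ".." segments.
bool isUnderAllowedRoot(const std::string& requested, const std::string& allowedRoot) {
    const fs::path canonical = fs::weakly_canonical(fs::path(requested));
    const fs::path root = fs::weakly_canonical(fs::path(allowedRoot));
    // lexically_relative yields "." for the root itself, a plain child path
    // for paths inside the root, and a path starting with ".." otherwise.
    const fs::path rel = canonical.lexically_relative(root);
    if (rel.empty()) {
        return false;  // unrelated roots (e.g. different drives on Windows)
    }
    return rel.begin()->string() != "..";
}
```

Unlike the `std::mismatch` prefix test, this rejects both sibling directories that share a prefix and `..` traversal out of the allowed directory.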
```cpp
const auto createdAt = std::chrono::duration_cast<std::chrono::seconds>(created.time_since_epoch()).count();
const std::string responseId = "resp-" + std::to_string(createdAt);
StringBuffer buffer;
```
Response IDs are derived only from seconds since epoch (e.g., "resp-"). Under concurrency, multiple requests created within the same second can collide and produce identical response IDs across different streams/unary calls. Consider using microseconds (as earlier streaming implementation did) or a random/UUID-based suffix to guarantee uniqueness while keeping the existing prefix format.
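One possible shape of the fix (names are illustrative, not from the PR): combine microsecond resolution with a process-wide atomic counter, keeping the existing `resp-` prefix while ruling out in-process collisions even within the same microsecond.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>
#include <string>

// Illustrative sketch: timestamp gives rough ordering/debuggability,
// the atomic counter guarantees uniqueness under concurrency.
std::string makeResponseId() {
    static std::atomic<std::uint64_t> counter{0};
    const auto nowUs = std::chrono::duration_cast<std::chrono::microseconds>(
                           std::chrono::system_clock::now().time_since_epoch())
                           .count();
    return "resp-" + std::to_string(nowUs) + "-" + std::to_string(counter.fetch_add(1));
}
```

A random/UUID suffix would additionally avoid collisions across server restarts, at the cost of losing the timestamp's readability.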
```cpp
const std::string type = typeIt->value.GetString();
if (type == "input_text") {
    auto textIt = contentObj.FindMember("text");
    if (textIt == contentObj.MemberEnd() || !textIt->value.IsString()) {
        return absl::InvalidArgumentError("input_text requires a valid text field");
    }
    contentText = textIt->value.GetString();
} else if (type == "input_image") {
```
When input[i].content is an array, multiple input_text parts will overwrite contentText (only the last text part is preserved). The OpenAI Responses schema allows multiple content parts; if more than one input_text is provided, consider concatenating (or otherwise preserving ordering) instead of replacing, so the full prompt reaches the chat template/model.
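The accumulation logic could look like the sketch below. This is illustrative only: the real handler walks a rapidjson array, but the parts are modeled here as `(type, text)` pairs, and joining text parts with a newline is an assumption rather than anything the Responses spec mandates.

```cpp
#include <string>
#include <utility>
#include <vector>

// Illustrative sketch: append each input_text part in order instead of
// overwriting, so every text part reaches the chat template/model.
std::string collectInputText(const std::vector<std::pair<std::string, std::string>>& parts) {
    std::string contentText;
    for (const auto& part : parts) {
        if (part.first != "input_text") {
            continue;  // input_image etc. are handled separately
        }
        if (!contentText.empty()) {
            contentText += "\n";  // joiner between parts is an assumption
        }
        contentText += part.second;  // append instead of replace
    }
    return contentText;
}
```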